38 research outputs found

    A 64mW DNN-based Visual Navigation Engine for Autonomous Nano-Drones

    Full text link
    Fully-autonomous miniaturized robots (e.g., drones) with artificial intelligence (AI)-based visual navigation capabilities are among the most challenging drivers of Internet-of-Things edge intelligence. Visual navigation based on AI approaches, such as deep neural networks (DNNs), is becoming pervasive for standard-size drones, but is considered out of reach for nano-drones with a size of a few cm². In this work, we present the first (to the best of our knowledge) demonstration of a navigation engine for autonomous nano-drones capable of closed-loop, end-to-end, DNN-based visual navigation. To achieve this goal, we developed a complete methodology for parallel execution of complex DNNs directly on board resource-constrained, milliwatt-scale nodes. Our system is based on GAP8, a novel parallel ultra-low-power computing platform, and a 27 g commercial, open-source Crazyflie 2.0 nano-quadrotor. As part of our general methodology, we discuss the software mapping techniques that enable the state-of-the-art deep convolutional neural network presented in [1] to be fully executed on board within a strict 6 fps real-time constraint, with no compromise in terms of flight results, while all processing is done with only 64 mW on average. Our navigation engine is flexible and can span a wide performance range: at its peak performance corner, it achieves 18 fps while still consuming, on average, just 3.5% of the power envelope of the deployed nano-aircraft. Comment: 15 pages, 13 figures, 5 tables, 2 listings; accepted for publication in the IEEE Internet of Things Journal (IEEE IoTJ).
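The 64 mW / 6 fps figures quoted above imply a simple per-inference energy budget; a back-of-the-envelope check (values taken directly from the abstract, not an additional measurement):

```python
def energy_per_frame_mj(power_mw: float, fps: float) -> float:
    """Average energy per processed frame in millijoules.

    mW divided by frames/s gives mJ per frame, since 1 mW = 1 mJ/s.
    """
    return power_mw / fps

# 64 mW average at the 6 fps real-time constraint -> ~10.7 mJ per inference.
baseline_mj = energy_per_frame_mj(64.0, 6.0)
```

At the 18 fps peak corner the abstract gives only a relative power figure (3.5% of the aircraft's envelope), so no absolute per-frame energy can be derived for that point.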

    Vision-State Fusion: Improving Deep Neural Networks for Autonomous Robotics

    Full text link
    Vision-based perception tasks fulfill a paramount role in robotics, facilitating solutions to many challenging scenarios, such as acrobatic maneuvers of autonomous unmanned aerial vehicles (UAVs) and robot-assisted high-precision surgery. Most control-oriented and egocentric perception problems are commonly solved by taking advantage of the robot's state estimation as an auxiliary input, particularly when artificial intelligence comes into the picture. In this work, we propose to apply a similar approach, for the first time (to the best of our knowledge), to allocentric perception tasks, where the target variables refer to an external subject. We show how our general and intuitive methodology improves the regression performance of deep convolutional neural networks (CNNs) on ambiguous problems such as allocentric 3D pose estimation. By analyzing three highly different use cases, spanning from grasping with a robotic arm to following a human subject with a pocket-sized UAV, our results consistently improve the R² metric, by up to +0.514, compared to the stateless baselines. Finally, we validate the in-field performance of a closed-loop autonomous pocket-sized UAV on the human pose estimation task. Our results show a significant reduction, i.e., 24% on average, in the mean absolute error of our stateful CNN. Comment: 8 pages, 8 figures.
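The R² (coefficient of determination) metric reported above can be computed as follows; a minimal sketch of the standard definition, not the authors' evaluation code:

```python
def r_squared(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot.

    1.0 means perfect regression; 0.0 means no better than
    always predicting the mean of the targets.
    """
    mean = sum(y_true) / len(y_true)
    ss_tot = sum((y - mean) ** 2 for y in y_true)          # total variance
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))  # residuals
    return 1.0 - ss_res / ss_tot
```

An improvement of "+0.514 R²" thus means the stateful model explains substantially more of the target variance than its stateless baseline on the same test set.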

    Land & Localize: An Infrastructure-free and Scalable Nano-Drones Swarm with UWB-based Localization

    Full text link
    Relative localization is a crucial functional block of any robotic swarm. We address it in a fleet of nano-drones characterized by a 10 cm-scale form factor, which makes them highly versatile but also strictly limited in their onboard power envelope. State-of-the-art solutions leverage Ultra-WideBand (UWB) technology, allowing distance measurements between peer nano-drones and a stationary infrastructure of multiple UWB anchors. To remove the need for a fixed infrastructure, we propose a UWB-based infrastructure-free nano-drone swarm, where part of the fleet acts as dynamic anchors, i.e., anchor-drones (ADs), capable of automatic deployment and landing. By varying the ADs' position constraints, we develop three alternative solutions with different trade-offs between flexibility and localization accuracy. In-field results, with four flying mission-drones (MDs), show a localization root mean square error (RMSE) spanning from 15.3 cm to 27.8 cm. Scaling the number of MDs from 4 to 8, the RMSE increases only marginally, i.e., by less than 10 cm. The power consumption of the MDs' UWB module amounts to 342 mW. Ultimately, compared to a fixed-infrastructure commercial solution, our infrastructure-free system can be deployed anywhere and rapidly, taking 5.7 s to self-localize 4 ADs, with at most a 12.3% higher localization RMSE in the most challenging case with 8 MDs.

    Cyber Security aboard Micro Aerial Vehicles: An OpenTitan-based Visual Communication Use Case

    Full text link
    Autonomous Micro Aerial Vehicles (MAVs), with a form factor of 10 cm in diameter, are an emerging technology thanks to the broad applicability enabled by their onboard intelligence. However, these platforms are strongly limited in the onboard power envelope for processing, i.e., less than a few hundred mW, which confines the onboard processors to the class of simple microcontroller units (MCUs). These MCUs lack advanced security features, opening the way to a wide range of cyber-security vulnerabilities, from the communication between agents of the same fleet to the onboard execution of malicious code. This work presents an open-source System-on-Chip (SoC) design that integrates a 64-bit Linux-capable host processor accelerated by an 8-core, 32-bit parallel programmable accelerator. The heterogeneous system architecture is coupled with a security enclave based on an open-source OpenTitan root of trust. To demonstrate our design, we propose a use case where OpenTitan detects a security breach on the SoC aboard the MAV and drives its exclusive GPIOs to start an LED-blinking routine. This procedure embodies an unconventional visual communication channel between two palm-sized MAVs: the receiver MAV classifies the LED state of the sender (on or off) with an onboard convolutional neural network running on the parallel accelerator. Then, it reconstructs a high-level message in 1.3 s, 2.3 times faster than current commercial solutions.
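The LED channel described above is essentially on-off keying: each classified frame yields one bit, and groups of bits are reassembled into bytes. A minimal decoder sketch; the framing (8 bits per symbol, MSB first) is an assumption for illustration, not the paper's protocol:

```python
def decode_ook(bits):
    """Reassemble a byte string from per-frame LED on/off bits.

    bits: iterable of 0/1 (LED off/on), MSB first; any trailing
    partial byte is discarded.
    """
    bits = list(bits)
    out = bytearray()
    for i in range(0, len(bits) - len(bits) % 8, 8):
        byte = 0
        for b in bits[i:i + 8]:
            byte = (byte << 1) | b   # shift in one classified frame
        out.append(byte)
    return bytes(out)
```

In the paper's setup, the per-frame bit comes from the receiver's onboard CNN classifying the sender's LED state, so the end-to-end bit rate is bounded by the classification throughput.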

    Sound Event Detection with Binary Neural Networks on Tightly Power-Constrained IoT Devices

    Full text link
    Sound event detection (SED) is a hot topic in consumer and smart-city applications. Existing approaches based on deep neural networks are very effective, but highly demanding in terms of memory, power, and throughput when targeting ultra-low-power always-on devices. Latency, availability, cost, and privacy requirements are pushing recent IoT systems to process the data on the node, close to the sensor, with a very limited energy supply and tight constraints on memory size and processing capabilities that preclude running state-of-the-art DNNs. In this paper, we explore the combination of extreme quantization to a small-footprint binary neural network (BNN) with the highly energy-efficient, RISC-V-based (8+1)-core GAP8 microcontroller. Starting from an existing CNN for SED whose footprint (815 kB) exceeds the 512 kB of memory available on our platform, we retrain the network using binary filters and activations to match these memory constraints. (Fully) binary neural networks come with a natural accuracy drop of 12-18% on the challenging ImageNet object recognition task compared to their equivalent full-precision baselines. Our BNN reaches 77.9% accuracy, just 7% lower than the full-precision version, with 58 kB (7.2 times less) for the weights and 262 kB (2.4 times less) of memory in total. With our BNN implementation, we reach a peak throughput of 4.6 GMAC/s and 1.5 GMAC/s over the full network, including preprocessing with Mel bins, which corresponds to an efficiency of 67.1 GMAC/s/W and 31.3 GMAC/s/W, respectively. Compared to an ARM Cortex-M4 implementation, our system has a 10.3 times faster execution time and a 51.1 times higher energy efficiency. Comment: 6 pages, conference.
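In a fully binarized network, the multiply-accumulate collapses to XNOR plus popcount over bit-packed {-1, +1} weights and activations, which is what makes BNNs so cheap on MCUs. A sketch of that kernel (illustrative only, independent of the GAP8 implementation):

```python
def binary_dot(w_bits: int, a_bits: int, n: int) -> int:
    """Dot product of n {-1,+1} values packed as bits (1 -> +1, 0 -> -1).

    XNOR counts positions where weight and activation agree; each
    agreement contributes +1 and each disagreement -1, hence
    result = 2 * popcount(xnor) - n.
    """
    xnor = ~(w_bits ^ a_bits) & ((1 << n) - 1)  # mask to n valid bits
    return 2 * bin(xnor).count("1") - n
```

A 32-bit XNOR plus a hardware popcount thus replaces 32 full-precision multiply-accumulates, which is the source of the GMAC/s/W efficiency gap reported above.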

    On the Autonomous Navigation of Nano-UAVs

    No full text
    Whether we like it or not, flying robots are rapidly evolving, getting ever smaller. Pioneering research groups have achieved fully functional insect-scale unmanned aerial vehicles, i.e., pico-size UAVs. This new wave of innovation has huge economic and social game-changing potential, enabling a vast number of visionary applications. However, several challenges still prevent us from entering the insect-size intelligent robot era. In this thesis, we aim at getting beyond what we call the "wall of onboard intelligence". Nowadays, relatively big drones (i.e., standard- to micro-size UAVs) have proven capable of impressive autonomous sense-and-act capabilities, but this intelligence comes at the price of bulky and power-hungry onboard computational units. These high-end devices, capable of a tremendous peak throughput of tens of TOps/s within tens of Watts, are unaffordable for small-size UAVs, as their power budget and available payload are severely limited. To date, this "wall of onboard intelligence" stands at the nano-size class of vehicles, characterized by a mass of a few tens of grams and a computational power budget in the range of a few hundred mW. Therefore, this thesis tries to answer the fundamental question of how to bring state-of-the-art autonomous navigation capabilities aboard nano- (and potentially pico-) size UAVs. The thesis starts by investigating the hardware/software key enablers for high energy efficiency and for cutting down the computational requirements of navigation workloads without compromising the overall quality of results. As a domain-specific example application, we address a global path planner. Our findings demonstrate that i) heterogeneous hardware architectures, ii) the parallel computing paradigm, and iii) approximate computing techniques can enable more than an order-of-magnitude improvement in energy efficiency.
Additionally, we show that the introduction of approximate computing in a cyber-physical system (CPS) under strict real-time constraints implies a "semantic shift" of the paradigm itself. Traditionally, approximate computing trades off numerical accuracy in favor of higher energy efficiency; in this context, the gain due to the "simplified" computation can also translate into a higher quality of the mission (e.g., reduced response time and increased reactivity). The second part of the thesis extends our energy investigation to the system level, aiming at lifetime maximization of nano-size UAVs. Selecting a nano-blimp platform, consisting of a helium balloon and a rotorcraft, we show how the lifetime of the CPS can be significantly extended by i) dynamically adapting the motor control, ii) duty-cycling high-power actuators, and iii) adding solar harvesting. According to our study, even in modest lighting conditions, the ultimate limiting factor for a self-sustainable nano-blimp is the balloon's deflation rate. In the third part of the thesis, we explore the applicability of the heterogeneous model in the ultra-low-power (ULP) domain, i.e., one representative of nano- and pico-size UAVs' power budgets. We demonstrate the feasibility of coupling a low-power Cortex-M microcontroller with a programmable parallel ULP accelerator (PULP) for speeding up computation-intensive algorithms on a milliwatt-scale budget. To further validate the proposed heterogeneous model, we extend our analysis to two concrete example applications representative of the autonomous navigation domain: a visual servoing and a visual odometry pipeline. Our evaluation shows that a speedup of more than an order of magnitude is achievable vs. single-core MCUs in a ULP setting, without compromising the platform's programmability, while guaranteeing real-time performance and leaving enough computational bandwidth to execute additional onboard tasks.
The last part of the thesis is devoted to combining all the previous building blocks with a novel bio-inspired class of autonomous navigation algorithms. Leveraging the DroNet convolutional neural network (CNN), we developed a complete deployment methodology targeted at enabling the execution of complex CNNs directly aboard resource-constrained, milliwatt-scale nodes. We achieve energy-efficient real-time processing of DroNet on board a nano-UAV, with a peak throughput of ~1.5 GOps/s and energy efficiency up to ~7.7 GOps/s/W. Our field-proven, closed-loop, autonomous nano-UAV can follow a street/corridor while preventing collisions with dynamic obstacles. We answer our initial question by creating, to the best of our knowledge, the first example of a new intelligent nano-size species of UAVs.

    Exploring Single Source Shortest Path Parallelization on Shared Memory Accelerators

    No full text
    Single Source Shortest Path (SSSP) algorithms are widely used in embedded systems for several applications. The emerging trend towards the adoption of heterogeneous designs in embedded devices, where low-power parallel accelerators are coupled to the main processor, opens new opportunities to deliver superior performance/watt, but calls for efficient parallel SSSP implementations. In this work, we provide a detailed exploration of the Δ-stepping algorithm's performance on a representative heterogeneous embedded system, the TI Keystone II, considering the impact of several parallelization parameters (threading, load balancing, synchronization).
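Δ-stepping exposes parallelism by relaxing edges in distance buckets of width Δ: "light" edges (weight ≤ Δ) are relaxed iteratively until the current bucket empties, while "heavy" edges are relaxed once per bucket. A compact sequential sketch of the algorithm (the paper studies its parallel variants, which this sketch does not attempt to reproduce):

```python
import math

def delta_stepping(graph, source, delta):
    """SSSP distances via delta-stepping. graph: {u: [(v, weight), ...]},
    with every node present as a key; weights non-negative."""
    dist = {u: math.inf for u in graph}
    dist[source] = 0.0
    buckets = {0: {source}}

    def relax(v, new_dist):
        if new_dist < dist[v]:
            if dist[v] < math.inf:  # move v out of its current bucket
                buckets.get(int(dist[v] // delta), set()).discard(v)
            buckets.setdefault(int(new_dist // delta), set()).add(v)
            dist[v] = new_dist

    while buckets:
        i = min(buckets)
        settled = set()
        while buckets.get(i):        # light-edge phase may refill bucket i
            frontier = buckets.pop(i)
            settled |= frontier
            for u in frontier:
                for v, w in graph[u]:
                    if w <= delta:
                        relax(v, dist[u] + w)
        buckets.pop(i, None)
        for u in settled:            # heavy edges: once per emptied bucket
            for v, w in graph[u]:
                if w > delta:
                    relax(v, dist[u] + w)
    return dist
```

In a parallel implementation, all nodes of the current frontier can be relaxed concurrently, which is precisely where the threading, load-balancing, and synchronization parameters explored in the paper come into play.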

    Ultra low-power visual odometry for nano-scale unmanned aerial vehicles

    No full text
    One of the fundamental functionalities for autonomous navigation of Unmanned Aerial Vehicles (UAVs) is the hovering capability. State-of-the-art techniques for implementing hovering on standard-size UAVs process the camera stream to determine position and orientation (visual odometry). Similar techniques are considered unaffordable in the context of nano-scale UAVs (i.e., a few centimeters in diameter), where the ultra-constrained power envelopes of tiny rotorcraft limit the onboard computational capabilities to those of low-power microcontrollers. In this work, we study how the emerging ultra-low-power parallel computing paradigm can enable the execution of complex hovering algorithmic flows on nano-scale UAVs. We provide insight into the software pipeline, the parallelization opportunities, and the impact of several algorithmic enhancements. Results demonstrate that the proposed software flow and architecture can deliver unprecedented GOps/W, achieving 117 frames per second within a power envelope of 10 mW.